Multi-modal video retrieval using Dilated Pyramidal Residual network
Authors
Abstract
Similar resources
Multi-modal query expansion for video object instances retrieval
In this paper we tackle the problem of retrieving object instances from video repositories using minimal information from the user (e.g., textual descriptions or tags). Starting from a set of tags, images containing the object of interest are crawled from popular image search engines and repositories (e.g., Bing, Flickr, Google) and the positive and most representative instances of the object are automati...
Multi-modal Classifier Fusion for Video Shot Content Retrieval
In this paper we present a new chromosome to solve the problem of classifier fusion using a genetic algorithm. Experiments are conducted in the context of TRECVID. In particular, we focus on the feature extraction task, which consists of retrieving video shots expressing one of the predefined semantic concepts. Three modalities (visual, textual and motion) and two features per modality are used to descr...
Dilated Residual Network for Image Denoising
Variations of deep neural networks such as convolutional neural networks (CNNs) have been successfully applied to image denoising. The goal is to automatically learn a mapping from a noisy image to a clean image given training data consisting of pairs of noisy and clean image patches. Most existing CNN models for image denoising have many layers. In such cases, the models involve a large amount o...
News Video Retrieval using Multi-modal Query-dependent Model and Parallel Text Corpus
This paper describes a fully automated news video retrieval system that is capable of retrieving relevant shots using a multimedia query. Our emphasis is three-fold. First, we explore the use of multi-modal features such as speaker identification, video OCR, face recognition and named entities in ASR text, along with pseudo relevance feedback, for video retrieval. Second, we employ query...
Multi-Modal Face Tracking Using Bayesian Network
This paper presents a Bayesian network based multimodal fusion method for robust and real-time face tracking. The Bayesian network integrates a prior of second order system dynamics, and the likelihood cues from color, edge and face appearance. While different modalities have different confidence scales, we encode the environmental factors related to the confidences of modalities into the Bayes...
Journal
Journal title: Science and Technology Development Journal - Natural Sciences
Year: 2019
ISSN: 2588-106X,2588-106X
DOI: 10.32508/stdjns.v2i5.789